## -------------------------------------------------- #
## BuenoNunesZucco_CPS_README
## -------------------------------------------------- #

 Date: 2024-09-03

 Authors: Natália S. Bueno, Felipe Nunes, and Cesar Zucco

 Title: Benefits by luck: A study of lotteries as a selection method for government programs
 
Contact Information: 
   Natália S. Bueno <natalia.bueno@emory.edu>
   Felipe Nunes <felipnunes@gmail.com>
   Cesar Zucco <cesar.zucco@fgv.br>
   
	
 Copyright (c) 2024, under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License.
 For more information see: http://creativecommons.org/licenses/by-nc-sa/3.0/us/
 All rights reserved. 


## -------------------------------------------------- #

This file describes the contents of the replication archive used to conduct the analyses in the main text and appendix. 


## -------------------------------------------------- #
## install R and necessary packages for analysis
## -------------------------------------------------- #

Install R and RStudio.

Install packages if necessary. See the file packages.html in the Code folder for package versions and session info.

Save the replication files locally, preserving the folder structure of the replication materials. The replication code assumes this folder (directory) structure: as long as the folders are in the R working directory, the scripts will find the files they need.

Click on the .Rproj file to open the R project in RStudio; you can then run any of the .R files.

Run the replication files (within the Code folder) one at a time, following the numbers at the start of the file names, beginning with number 1 and ending with number 4.

Code files that start with "_" are not necessary for replication.
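
The run order described above can be sketched in R as follows. This is only an illustration: the helper name numbered_scripts is hypothetical, and the sketch assumes the working directory is the project root (the folder containing the .Rproj file). The source() call is left commented out so that nothing runs until you have checked the list.

```r
# Hypothetical helper: list the numbered scripts in the Code folder in run order.
# Files starting with "_" and functions.R do not match the pattern and are skipped.
numbered_scripts <- function(code_dir = "Code") {
  files <- list.files(code_dir, pattern = "^[0-9]+_.*\\.R$")
  sort(files, method = "radix")
}

scripts <- numbered_scripts()
print(scripts)

# Assumption: scripts are run from the project root. Uncomment to run them in order.
# for (s in scripts) source(file.path("Code", s), echo = TRUE)
```

Sorting by file name puts every script numbered 1 before any script numbered 2, which matches the instruction to follow the numbers at the start of the file names.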


## -------------------------------------------------- #
## hardware and software 
## -------------------------------------------------- #

The versions of R and macOS in use at the time the paper was published were:

R version 4.3.1 (2023-06-16) -- "Beagle Scouts"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: aarch64-apple-darwin20 (64-bit)

All models were estimated on a Mac mini, running macOS Ventura (13.6.4).

## -------------------------------------------------- #
## folder structure
## -------------------------------------------------- #

To run these replication materials successfully, we suggest keeping the
folder structure as follows:

Code (for the .R scripts)
Data (for the different data files)
Routputs (for the .RData files)
Tables (for the .tex files)
Figures (for the .pdf files)
Questionnaires-InterviewScript (questionnaires for W1, W2, and qualitative interviews)

Also, use the Replication-Lotteries.Rproj in your main folder to set your R project. 
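
Before running anything, it can help to confirm that the folders are where the scripts expect them. The following is a minimal sketch, assuming the working directory is the project root; the helper name check_folders is hypothetical, and only the five code, data, and output folders listed above are checked.

```r
# Folder names as listed in this README; the helper name is hypothetical.
expected_folders <- c("Code", "Data", "Routputs", "Tables", "Figures")

# Returns a named logical vector: TRUE where the folder exists under root.
check_folders <- function(root = ".", folders = expected_folders) {
  present <- dir.exists(file.path(root, folders))
  names(present) <- folders
  present
}

status <- check_folders()
missing_folders <- names(status)[!status]
if (length(missing_folders) > 0) {
  message("Missing folders: ", paste(missing_folders, collapse = ", "))
} else {
  message("All expected folders found.")
}
```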


## -------------------------------------------------- #
## file and folder descriptions
## -------------------------------------------------- #
 
Codebook.pdf --- Codebook describing all variables in the datasets used in the analysis of the manuscript and appendix


Code --- folder containing the following script files:

	
	functions.R: 
		R file with functions used in creating data, main paper and appendix. 
		These functions are called from the other routines
	
	
	_creating_data_W1_public.R: 
		R file that creates the dataset used in the W1 analyses. 
		This script cannot be run by replicators because it uses the raw data, which contain identifiable information and are not shared.
		It produces the anonymized datasets that are the starting point for the replication. 

	_creating_data_W2_public.R: 
		R file that creates the dataset used in the W2 analyses. 
		This script cannot be run by replicators because it uses the raw data, which contain identifiable information and are not shared.
		It produces the anonymized datasets that are the starting point for the replication. 
	
	1_analysis_W1.R: 
		R file that recodes the W1 data and produces the estimates for the W1 analysis.
		Requires the dataset produced by _creating_data_W1_public.R, which is provided in the Data folder.
	
	1_analysis_W2.R: 
		R file that recodes the W2 data and produces the estimates for the W2 analysis.
		Requires the dataset produced by _creating_data_W2_public.R, which is provided in the Data folder.

	1_analysis_WW.R:
		R file that recodes (non-accessible) data for Table C.1. See the note under Routputs below. 
  	
	2_analysis_output_W1.R: 
		R file that produces tables, figures, and data cited in the main paper and in the online appendix from W1.
		Should be run after the previous routines, but it can also be run independently of any other code
		because the outputs from 1_analysis_W1.R are provided.
	
	2_analysis_output_W2.R: 
		R file that produces tables, figures, and data cited in the main paper and in the online appendix from W2.
		Should be run after the previous routines, but it can also be run independently of any other code
		because the outputs from 1_analysis_W2.R are provided.
	
	2_analysis_survey_exp.R: 
		R file that produces tables, figures, and data cited in the main paper and in the online appendix from the survey.

	2_figure-1.R: 
		R file that produces Figure 1.


Figures --- folder contains all figures as pdf files

	All figures are provided, but they can be regenerated by running the code above.


Tables --- folder contains all table outputs as tex files

	All tables are provided, but they can be regenerated by running the code above.

Questionnaires-InterviewScripts: Questionnaires and Interview Scripts in Portuguese

Routputs --- folder contains estimates from analyses to be used in the Figures and Tables in the main paper and online appendix

    These files are generated by running .R files 1-4, above, with the exception of these two files:

	out-a1.RData
	out-inscritosearlylate.RData

    These two files require identified data, so they cannot be produced with the code provided. For transparency, we left the original code that produced them in 1_analysis_WW.R (as comments), and we provide the pre-assembled objects instead.
 

Data --- folder containing the datasets used in the main analysis and in the appendix; see Codebook.pdf for a description of the datasets. The datasets were originally created by _creating_data_W1_public.R and _creating_data_W2_public.R, but these scripts require identified individual information that cannot be publicly shared. We therefore provide the code that creates the files, but not the identified data. Instead, we provide the following files with de-identified data:

	surveyW1-CPS.Rda
	surveyW2-CPS.Rda
	W1_attrition_overall_public.Rda
	W1_attrition_public.Rda
	W2_attrition_overall_public.Rda
	W2_attrition_public.Rda
	W2_attrition_admin_overall_public.Rda




## -------------------------------------------------- #
# additional notes 
## -------------------------------------------------- #

We do not provide the raw datasets containing private individual identifiers. 
Our scripts for creating the datasets (i.e., those files whose names begin with _) show our data manipulations, but the raw datasets are not available because they contain personally identifiable information.



## -------------------------------------------------- #
## end of file
## -------------------------------------------------- #

